Techniques for Estimating the Ideal Binary Mask

Authors

  • Yi Hu
  • Philipos C. Loizou
Abstract

This paper compares binary mask estimation techniques that differ in how they estimate the instantaneous SNR. The effect of six gain functions and three noise estimation algorithms on the SNR estimate, and subsequently on the binary mask, was assessed. New criteria are proposed for classifying time-frequency bins as belonging to the target or masker signals. Sentences from the NOIZEUS corpus embedded in four types of noise at SNR levels of 0-10 dB were used for evaluation. Performance of the binary mask estimation algorithms was evaluated in terms of hit rate and false-alarm rate. Results indicated that the choice of SNR estimation technique primarily affects the false-alarm rate.
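To make the evaluation metrics concrete, here is a minimal sketch of how a binary mask can be derived from a local SNR and scored against the ideal mask. It assumes the common definitions: a bin is labeled target-dominated when its local SNR exceeds a threshold, the hit rate is the fraction of target-dominated bins labeled correctly, and the false-alarm rate is the fraction of masker-dominated bins wrongly labeled as target. The function names, the 0 dB threshold, and the toy power values are illustrative assumptions; the perturbed masker estimate is only a stand-in and is not one of the paper's gain functions, noise estimators, or proposed criteria.

```python
import numpy as np


def binary_mask_from_snr(target_power, masker_power, threshold_db=0.0):
    """Return a boolean mask: True where the local SNR exceeds threshold_db.

    The 0 dB default is an assumed, commonly used threshold.
    """
    eps = np.finfo(float).eps  # guards against division by zero and log(0)
    target_power = np.asarray(target_power, dtype=float)
    masker_power = np.asarray(masker_power, dtype=float)
    snr_db = 10.0 * np.log10((target_power + eps) / (masker_power + eps))
    return snr_db > threshold_db


def hit_and_false_alarm(estimated_mask, ideal_mask):
    """Score an estimated mask against the ideal one.

    hit rate         = target-dominated bins correctly labeled / all target bins
    false-alarm rate = masker-dominated bins labeled target / all masker bins
    """
    est = np.asarray(estimated_mask, dtype=bool)
    ideal = np.asarray(ideal_mask, dtype=bool)
    n_target = ideal.sum()
    n_masker = (~ideal).sum()
    hit = (est & ideal).sum() / n_target if n_target else float("nan")
    fa = (est & ~ideal).sum() / n_masker if n_masker else float("nan")
    return hit, fa


# Toy 2x2 time-frequency grid of power values (hypothetical numbers).
target_power = np.array([[4.0, 1.0], [9.0, 1.0]])
masker_power = np.array([[1.0, 4.0], [1.0, 9.0]])

# The ideal mask uses the true (unmixed) target and masker powers.
ideal = binary_mask_from_snr(target_power, masker_power)

# Stand-in for an imperfect estimate: the masker power is mis-estimated.
masker_estimate = np.array([[1.0, 0.5], [16.0, 9.0]])
estimated = binary_mask_from_snr(target_power, masker_estimate)

hit, fa = hit_and_false_alarm(estimated, ideal)
print(f"hit rate = {hit:.2f}, false-alarm rate = {fa:.2f}")
```

In this toy case, under-estimating the masker in one bin produces a false alarm and over-estimating it in another produces a miss, which is why the two rates are reported separately.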


Similar resources

Asr-driven Binary Mask Estimation for Robust Automatic Speech Recognition

Additive noise has long been an issue for robust automatic speech recognition (ASR) systems. One approach to noise robustness is the removal of noise information through segregation by binary time-frequency masks; each time-frequency unit in a spectro-temporal representation of the speech signal is labeled either noise-dominant or signal-dominant. The noise-dominant units are masked and their e...


Factors influencing intelligibility of ideal binary-masked speech: implications for noise reduction.

The application of the ideal binary mask to an auditory mixture has been shown to yield substantial improvements in intelligibility. This mask is commonly applied to the time-frequency (T-F) representation of a mixture signal and eliminates portions of a signal below a signal-to-noise ratio (SNR) threshold while allowing others to pass through intact. The factors influencing intelligibility of ...


Blind Dereverberation of Audio Signals

This project examines the problem of single-channel blind dereverberation. After estimating the T60 value, a time-domain binary masking approach was used to remove regions of the signal that were largely dominated by reverberant energy. Performance of the system was examined for several different classes of audio (hand clapping, drums, and speech) and for varying amounts of reverberation. In ad...


A data-driven approach for estimating the time-frequency binary mask

The ideal binary mask, often used in robust speech recognition applications, requires an estimate of the local SNR in each time-frequency (T-F) unit. A data-driven approach is proposed for estimating the instantaneous SNR of each T-F unit. By assuming that the a priori SNR and a posteriori SNR are uniformly distributed within a small region, the instantaneous SNR is estimated by minimizing the l...


Estimation of the Ideal Binary Mask Using Directional Systems

The ideal binary mask is often seen as a goal for time-frequency masking algorithms trying to increase speech intelligibility, but the required availability of the unmixed signals makes it difficult to calculate the ideal binary mask in any real-life applications. In this paper we derive the theory and the requirements to enable calculations of the ideal binary mask using a directional system w...




Publication date: 2008